Abstract

The concept of neutral evolutionary networks being a significant
factor in evolutionary dynamics was first proposed by
Huynen et al. about 7 years ago. In one sense, the principle
is easy to state: because most mutations to an organism
are deleterious, one would expect that neutral mutations
that don't affect the phenotype will have disproportionately
greater representation amongst successor organisms than one
would expect if each mutation were equally likely.

So it was with great surprise that I noted neutral mutations 
being very rare in a visualisation of phylogenetic trees gen- 
erated in Tierra, since I already knew that there was a signif- 
icant amount of neutrality in the Tierra genotype-phenotype 
map. 

It turns out that competition for resources between host and 
parasite inhibits neutral evolution. 
Introduction

The influence of neutral networks in evolutionary processes 
was first elucidated by Peter Schuster's group in Vienna in 
1996 (Huynen et al., 1996; Reidys et al., 1997). Put simply,
two genotypes are considered neutrally equivalent if
they map to the same phenotype. A neutral network is a 
set of genotypes connected by this neutrality relationship 
on links with Hamming distance 1 (i.e. each link of the 
network corresponds to a mutation at a single site of the 
genome). It should be noted that this definition is subtly 
different from that employed in Kimura's neutral evolution 
theory (Kimura, 1983), as in that theory, neutrality is defined
as equivalence of fitness values, a notion that is ill-defined in
coevolutionary systems. However, as phenotypically equivalent
organisms are neutral in Kimura's sense when a fitness
function exists, much of neutral theory can be carried over 
into discussion of phenotypic neutrality. 

Schuster's group noted that evolution tended to proceed 
by diffusion along these neutral networks, punctuated occa- 
sionally by rapid changes to phenotypes as an adaptive fea- 
ture is discovered. The similarity of these dynamics with the 
theory of Punctuated Equilibria (Eldridge, 1985) was noted
by Barnett (1998). It was also noted that if a giant network
exists that comes within a hop or two of every possible genotype,
evolution will be particularly efficient at discovering
solutions, since only a few non-neutral mutations are needed
to reach the optimum solution.

Most work on neutrality in evolution uses the
genotype-phenotype mapping defined by the folding of
RNA (Schuster et al., 1994). This mapping is implemented
in the open source Vienna RNA package¹, so it is a convenient
and well-known testbed for ideas of neutrality in evolution.

Also in 1996, I developed a definition of the genotype-phenotype
mapping for Tierra, which was first published
in 1997 (Standish, 1997). I noticed the strong presence of
neutrality in this mapping at that time, which was later exploited
to develop a measure of complexity of the Tierran organism
(Standish, 1999; Standish, 2003). In 2002, I started
a programme to visualise Tierra's phylogenetic trees and
neutral networks (Standish and Galloway, 2002) in order to
"discover the unexpected". Two key findings came out of
this. The first was that Tierra's genebanker² data did not
provide clean phylogenetic trees, but had loops, and consisted
of many discontinuous pieces. This later turned out to
be due to Tierra's habit of reusing genotype labels if those
genotypes were not saved in the genebanker database, which
might happen if the population count of that genotype failed
to cross a threshold. This is all very well, except that a reference
to that genotype still exists in the parent field of successor
genotypes. The second big surprise was the paucity of
neutral mutations in the phylogenetic tree. We expect most
mutations to an organism to be deleterious, and so expect
that neutral mutations will have disproportionately greater
representation amongst successor genotypes than one would
expect if each mutation were equally likely.

¹ http://www.tbi.univie.ac.at/~ivo/RNA

² The genebanker is a database in which Tierra stores the
genotypes that arise during evolution.

Neutrality in Tierra

Tierra (Ray, 1991) is a well-known artificial life system in
which small self-replicating computer programs are executed
in a specially constructed simulator. These computer
programs (called digital organisms, or sometimes "critters")
undergo mutation, and radically novel behaviour is discovered,
such as parasitism and hyperparasitism.

It is clear what the genotype is in Tierra: it is just the listing
of the program code of the organism. The phenotype
is a more diffuse thing, however. It is the resultant effect of
running the computer program, in all possible environments.
Christoph Adami defined this notion of phenotype for a similar
artificial life system called Avida (Adami, 1998). In
Avida, things are particularly simple, in that organisms either
reproduce themselves at a fixed replication rate, or don't
as the case may be, and optionally perform a range of arithmetic
operations on special registers (defined by the experimenter).

In Tierra, organisms do interact with each other via a template
matching mechanism. For example, with a branching
instruction like jmpo, if there is a sequence of nop0 and
nop1 instructions (which are no-operations) following the
branch, this sequence of 1s and 0s is used as a template for
determining where to branch to. In this case the CPU will
search outwards through memory for a complementary sequence
of nop0s and nop1s. If the nearest complementary
sequence happens to lie in the code of a different organism,
the organisms interact.
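
The template search described above can be sketched roughly as follows. This is only an illustration, not Tierra's actual implementation: the soup is modelled as a flat Python list of instruction names without wrap-around, "op" is a placeholder for any other instruction, and the function names and the `max_dist` search limit are assumptions made for the example.

```python
NOP0, NOP1 = "nop0", "nop1"

def read_template(soup, start):
    """Read the run of nop0/nop1 instructions beginning at `start` as a list of bits."""
    bits = []
    i = start
    while i < len(soup) and soup[i] in (NOP0, NOP1):
        bits.append(0 if soup[i] == NOP0 else 1)
        i += 1
    return bits

def find_complement_outward(soup, origin, template, max_dist):
    """Search outwards from `origin` for the nearest complementary template.

    Returns the address where the complement starts, or None if nothing is
    found within `max_dist` (a hypothetical search limit).
    """
    if not template:
        return None
    # The complement swaps every nop0 for a nop1 and vice versa.
    target = [NOP1 if b == 0 else NOP0 for b in template]
    n = len(target)
    for dist in range(1, max_dist + 1):
        for pos in (origin + dist, origin - dist):
            if pos >= 0 and pos + n <= len(soup) and soup[pos:pos + n] == target:
                return pos
    return None

# A jmpo at address 3 followed by the template nop0 nop0.
soup = ["nop1", "nop1", "op", "jmpo", "nop0", "nop0", "op", "nop1", "nop1", "op"]
template = read_template(soup, 4)
print(find_complement_outward(soup, 3, template, max_dist=10))
```

If the returned address belongs to the memory block of another organism, the two organisms interact in the sense used in the text.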

To precisely determine the phenotype of a Tierran organism,
one would need to execute the soup containing
the organism and all possible combinations of other genotypes.
Whilst this is a finite task, it is clearly astronomically
difficult. One means of approximation is to consider
just the interaction of pairs of genotypes (called a tournament).
Most Tierran organisms interact pairwise; very
few triple or higher order interactions exist. Similarly, rather
than running tournaments with all possible genotypes, we
can approximate matters by using the genotypes stored in a
genebanker database after a Tierra run. In practice, it turns
out that various measures, such as the number of neutral
neighbours, or the total complexity of an organism, are fairly
robust with respect to the exact set of organisms used for the
tournaments.

So the procedure is to pit all organisms in the
genebanker pairwise against one another, and record the outcomes
in a table (there is a small number of possible outcomes, which
are detailed in Standish (1997)). A row of this table is a phenotypic
signature for the genotype labelling that row. We can
then eliminate those genotypes with identical signatures in
favour of one canonical genotype. This list of unique phenotypes
can be used to define pragmatically a test for neutrality
of two different genotypes, which may have been generated by
mutation from genotypes recorded in the genebanker: pit each
organism against the list of unique phenotypes, and if the
signatures match, we have neutrality. The source code for
this experiment is available from the author's website.³

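
The signature-based neutrality test can be sketched as below. This is a minimal illustration under stated assumptions, not the author's code: `outcome(a, b)` stands for a hypothetical function that runs a tournament between two genotypes and returns one of the small number of possible results, and the toy outcome used in the usage lines is invented purely so the example runs.

```python
def signatures(genotypes, opponents, outcome):
    """Map each genotype to its row of tournament outcomes against `opponents`."""
    return {g: tuple(outcome(g, o) for o in opponents) for g in genotypes}

def unique_phenotypes(genotypes, outcome):
    """Keep one canonical genotype for each distinct phenotypic signature."""
    sigs = signatures(genotypes, genotypes, outcome)
    canonical = {}
    for g in genotypes:
        canonical.setdefault(sigs[g], g)  # first genotype seen with this signature
    return list(canonical.values())

def is_neutral(a, b, test_set, outcome):
    """Two genotypes are neutral variants if their signatures on the test set match."""
    return all(outcome(a, o) == outcome(b, o) for o in test_set)

# Toy stand-in for a Tierra tournament: the result depends only on genome length.
def toy_outcome(g, o):
    return (len(g) > len(o)) - (len(g) < len(o))

bank = ["aab", "abb", "ab", "abcd"]
test_set = unique_phenotypes(bank, toy_outcome)
print(test_set)
print(is_neutral("aab", "bbb", test_set, toy_outcome))
```

Comparing against the reduced test set rather than the whole genebanker keeps the number of tournaments per candidate genotype small, which is the practical point of eliminating duplicate signatures.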
Tierra has three different modes of mutation:

Cosmic Ray: A site of the soup is randomly chosen and mutated.

Copy: Data is mutated during the copy operation.

Flaw: Instructions occasionally produce erroneous results.

Furthermore, in the case of cosmic ray and copy mutations,
a certain proportion of mutations involve bit flips, rather
than opcodes being substituted uniformly. This proportion
is set as a parameter in the soup_in file (MutBitProp);
in these experiments, this parameter is set to zero.
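
The difference between the two kinds of substitution can be illustrated with a short sketch. It assumes opcodes are 5-bit integers (consistent with the 32 opcodes implied by the neighbourhood size used later); the function name and the `mut_bit_prop` argument are illustrative stand-ins for the MutBitProp setting, not Tierra's own code.

```python
import random

def mutate_opcode(op, rng, mut_bit_prop=0.0, bits=5):
    """Return a mutated opcode.

    With probability `mut_bit_prop` a single bit of the opcode is flipped;
    otherwise the opcode is replaced by one drawn uniformly from all
    2**bits opcodes (which may coincide with the original).
    """
    if rng.random() < mut_bit_prop:
        return op ^ (1 << rng.randrange(bits))
    return rng.randrange(1 << bits)

rng = random.Random(42)
print(mutate_opcode(7, rng, mut_bit_prop=0.0))  # uniform substitution, as in these experiments
print(mutate_opcode(7, rng, mut_bit_prop=1.0))  # always a single bit flip
```

With the proportion set to zero, every opcode is an equally likely replacement at the mutated site, which is what makes the uniform 1 hop neighbourhood a sensible baseline.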

In order to study the issue of whether neutrality is greater
or less than expected in Tierra, I generated three datasets,
each with one of the 3 modes of mutation operating in isolation.
The sizes of the data sets were 69,139, 87,003 and 198,982
genotypes respectively, generated over a time period of
about 1000 million executed instructions. Genebanker's
threshold was set to zero, so all genotypes were captured.
This led to a proper phylogenetic tree. After performing a
neutrality analysis, sets of 83, 86 and 158 unique phenotypes
respectively were extracted as the test sets for the tournaments.

Since the neighbourhood size increases exponentially
with neighbourhood diameter, I restrict analysis to single
site, or point mutations. In each data set, around 7% of the
genotypes were created by a mutation at a single site and
were neutrally equivalent to their parent. For each of these,
I compute the number of neutral neighbours $n_i$ existing in
the 1 hop neighbourhood of the parent genotype $i$. The 1
hop neighbourhood size is $32\ell_i$, where $\ell_i$ is the length
of the genome. For a given parent $i$, the ratio

$$r_i = \frac{32\,\ell_i\,\nu_i}{o_i\,n_i} \tag{1}$$

gives the proportion of neutral links actually followed relative
to the number of neutral links available (neutrality excess),
where $\nu_i$ is the number of neutrally equivalent offspring,
$o_i$ the total number of offspring and $n_i$ the size
of the 1 hop neutral neighbourhood. Fig. 1 shows the running
average of this quantity over these transitions, with the
genotypes numbered in size order.
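
As a worked illustration of the neutrality excess and its running average, the following sketch computes the ratio of the observed neutral fraction of offspring to the neutral fraction of the 1 hop neighbourhood. The record layout (dictionaries with keys `nu`, `o`, `n` and `length`) and the sample numbers are assumptions made for the example, not data from the experiments.

```python
def neutrality_excess(nu, o, n, length, alphabet=32):
    """Observed neutral fraction (nu / o) divided by the available
    neutral fraction (n / (alphabet * length))."""
    if o <= 0 or n <= 0:
        raise ValueError("need at least one offspring and one neutral neighbour")
    return (nu / o) / (n / (alphabet * length))

def running_average(values):
    """Cumulative mean of `values`, one entry per prefix."""
    out, total = [], 0.0
    for k, v in enumerate(values, 1):
        total += v
        out.append(total / k)
    return out

def running_excess(parents):
    """Running average of the neutrality excess with parents ordered by genome size."""
    ordered = sorted(parents, key=lambda p: p["length"])
    return running_average(
        [neutrality_excess(p["nu"], p["o"], p["n"], p["length"]) for p in ordered]
    )

# Hypothetical parent: length 80, so 32 * 80 = 2560 neighbours, 256 of them neutral;
# 2 of its 20 offspring are neutral, matching the 10% available, so the excess is 1.
print(neutrality_excess(nu=2, o=20, n=256, length=80))
```

A value above 1 means neutral links were followed more often than their share of the neighbourhood would suggest, and a value below 1 means they were followed less often.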

Since all daughter genotypes are recorded, no selection is
operating. In this case, one would expect that the proportion
of neutral variants seen should be identical to the proportion
of neutral variants within the 1 hop neighbourhood, and
hence the neutrality excess should be identical to 1. However,
in the case of instruction flaws and cosmic ray mutations, not
every daughter genome will make it into the genebanker. In
the case of instruction flaws, it is rather unpredictable what
the effect is. In the case of cosmic ray mutations, 50% of
the time one would expect the parent to be mutated, rather than
the daughter. In the case of a mutation affecting a crucial
gene of a parent genotype, the organism may not be able
to reproduce at all, thus favouring neutral mutations. Only
copy mutations should affect all sites of the genome equally,
leading to a neutrality excess equal to one. The measured
value, however, is about 1.3, substantially greater than one.
The reason for this is not known at this point in time.

Figure 1: Running average of neutrality excess ($\langle r_i \rangle$).
Genomes are ordered according to size, and neutrality excess
is averaged over all genomes to the left of that data
point. Three different datasets are analysed, each with one of the
three modes of mutation turned on. Then the datasets are
further filtered to only include offspring whose maximum
population count is greater than 1, i.e. selection is operating.

³ http://parallel.hpc.unsw.edu.au/getaegisdist.cgi/getsource/eco-tierra.3,
version 3.D3

The datasets were further filtered to include just those
transitions whose daughter organism successfully reproduced,
i.e. with a maximum population count greater than 1.
The neutrality excess in this case is substantially less than 1,
so something in Tierran evolution is favouring non-neutral
evolution.

Competition Effects 

Consider a single species ecosystem with logistic dynamics:

$$\dot{x} = r x (1 - x/K), \tag{2}$$

where $x$ is the population size, $r$ the net reproductive rate
and $K$ the carrying capacity. A phenotypically equivalent
genotype attempting to invade this ecosystem will have the
following dynamics:

$$\dot{x}' = r x' (1 - x/K) \approx 0, \tag{3}$$

($x'$ being the population size of the invading genotype) as
$x \approx K$ at equilibrium. So there is a substantial likelihood
that the neutral variant fails to invade the ecosystem.

This argument is of course an extreme case. Stochastic effects
due to finite population sizes will increase the chances
of a neutral variant invading the ecosystem; however, the
point still remains that the neutral variant is not on an equal
footing with the incumbent.
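
The deterministic part of this argument can be checked numerically with a simple forward Euler integration of the resident and invader equations. The parameter values, step size and function name are arbitrary choices for illustration.

```python
def simulate(x0, xi0, r=1.0, K=1000.0, dt=0.01, steps=1000):
    """Integrate resident (x) and neutral invader (xi) populations with forward Euler.

    Both share the same per-capita growth rate r * (1 - x / K), which is
    set by the resident population alone, as in the invader equation above.
    """
    x, xi = x0, xi0
    for _ in range(steps):
        growth = r * (1.0 - x / K)
        x, xi = x + dt * growth * x, xi + dt * growth * xi
    return x, xi

# Resident already at carrying capacity: the invader's growth rate is zero,
# so a single neutral mutant stays a single mutant.
print(simulate(x0=1000.0, xi0=1.0))

# Resident far below carrying capacity: the invader grows alongside it.
print(simulate(x0=10.0, xi0=1.0))
```

In the first case the invader neither grows nor declines in this deterministic model, so whether it persists is left entirely to the stochastic effects mentioned in the text.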



In Tierra, however, there is an age structure in the population,
with organisms being placed in a reaper queue, from
which the oldest organisms are selected when death is required.
This fact alone implies that neutral variants of self-replicating
organisms will successfully replicate, and hence
simple competition between self-replicators cannot be
responsible for the neutrality deficiency.

However, consider a Tierran ecosystem consisting of
hosts and parasites, where the parasites require the presence
of a host organism within a certain distance of themselves in
the soup in order to replicate. Since parasitic organisms
replicate faster than the hosts (due to their smaller
program lengths), they tend to displace host organisms until
there are not enough hosts to go around, at which point
the parasites' fecundity drops. At equilibrium, the effective
reproductive rates of host and parasite are equal.

A neutral variant will therefore be quite likely to not have 
a suitable host in its neighbourhood to allow it to replicate. 
Consequently, neutral evolution is suppressed amongst par- 
asites. In the next section I will test this idea by setting up an 
artificial host-parasite coevolutionary system, using the well 
known RNA genotype-phenotype map. 

Vienna RNA Folding Experiments 

It is quite well known that evolution using the RNA folding
map (Schuster et al., 1994) exhibits a great deal of neutrality,
at least for a standard genetic algorithm optimising a
well defined fitness function. Until now, evolutionary systems
based on the RNA map have exhibited the unsurprising result
of the neutrality excess defined by eq. (1) being greater than or
equal to 1. I now present results of an RNA map experiment
that demonstrates neutrality suppression ($r_i < 1$), based
on the resource competition explanation elaborated earlier.
We need two types of organism (host and parasite) competing
for a fixed space that can support N = 100 organisms.
Parasites can only reproduce if they are situated next to a
host (neighbourhood size $\nu$ = 2), but reproduce twice as fast
as the host type.
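
The reproduction rule of this setup can be sketched as follows. Several details are assumptions made only for illustration and are not taken from the text: the space is treated as a ring of sites, the organism replaced by an offspring is chosen at random rather than by fitness, and genotypes and RNA folding are left out entirely.

```python
import random

N, NU = 100, 2  # from the text: space for N = 100 organisms, neighbourhood size 2

def has_host_nearby(world, i, nu=NU):
    """True if any site within `nu` steps of site i (on a ring) holds a host."""
    n = len(world)
    return any(world[(i + d) % n] == "H" for d in range(-nu, nu + 1) if d != 0)

def step(world, rng, nu=NU):
    """Pick one organism at random and let it try to reproduce.

    Hosts ("H") succeed half as often as parasites ("P"); a parasite with
    no host in its neighbourhood cannot replicate at all. The offspring
    overwrites a randomly chosen site (an assumed replacement rule).
    Returns True if a reproduction took place.
    """
    i = rng.randrange(len(world))
    kind = world[i]
    if kind == "H":
        if rng.random() >= 0.5:
            return False
    elif not has_host_nearby(world, i, nu):
        return False
    world[rng.randrange(len(world))] = kind
    return True

rng = random.Random(0)
world = [rng.choice("HP") for _ in range(N)]
births = sum(step(world, rng) for _ in range(1000))
print(births, world.count("H"), world.count("P"))
```

Even in this stripped-down form, a parasite's ability to reproduce depends on its local neighbourhood rather than on its own phenotype alone, which is the ingredient the resource competition explanation relies on.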

Once an organism has reproduced, it replaces the least fit 
organism. Fitness is determined by how close the parasitic 
phenotype is to any hosts in the neighbourhood of the par- 
asites, and decreases in a similar way with the similarity of 
the parasites in the neighbourhood for host organisms: 



[Equations (4) and (5), which define the parasite and host
fitnesses, are not recoverable from the source text; the only
surviving fragment is the index $i \in \mathcal{P}_\nu$.]

where h and p are host and parasite genotypes respectively,
$\mathcal{H}_\nu$ and $\mathcal{P}_\nu$ are the sets of hosts and
parasites respectively within the neighbourhood of size $\nu$
of p and h respectively, d(i,j) is the string edit distance
between the phenotypes⁴, and $\ell$ is the gene length (set
equal to 20 for all organisms in this experiment).

Figure 2: Neutrality excesses for RNA folding host-parasite
system as described in the text.

The factor $\rho$ adjusts the relative dominance of parasites
over hosts. Set it too low, and hosts will eliminate the parasites
by virtue of replacing them when replicating. Set it too
high, and hosts will only be competing with themselves. In
this experiment, a value of $\rho$ = 3.1 was found to give
intermediate behaviour.

An alternative version of this experiment, where organisms
were selected at random for death rather than according
to a fitness relationship, showed similar dynamics, although
the neighbourhood size $\nu$ needed to be increased to
4 to allow a stable population of parasites to persist.

Figure 2 shows the neutrality excess for this experiment.
The model will consistently produce a neutrality deficiency
for the parasites over a broad range of model parameters.
If the parameter $\rho$ is set too high, the hosts will compete
strongly with themselves, suppressing neutrality in the host
population also.

Source code for this experiment is available from the
author's website.⁵

⁴ The string edit distance is related to the Hamming distance
(no. of base pairs that differ between two strings), but allows for
gaps in the strings. Given a set of edit operations (e.g. insertions
and deletions) and edit costs, the edit distance is given by the
minimum sum of the costs along an edit path converting one object into
the other. Please consult the Vienna RNA package documentation
for a precise definition of string edit distance.

⁵ http://parallel.hpc.unsw.edu.au/getaegisdist.cgi/getsource/rnafold/,
version D1

Conclusion

The suppression of neutrality in Tierran evolution is a real
effect. An explanation couched in terms of host-parasite
competition was found, and a model was constructed using
the well-known RNA folding map that illustrated this explanation.

This finding is potentially important. It has been argued 
that neutral diffusion is an important feature of evolutionary 
processes allowing efficient search of phenotype space. The 
sort of competition effects seen here to impede neutral dif- 
fusion are characteristic of climax ecosystems. This would 
imply that disturbed ecosystems will have greater evolvabil- 
ity than climax systems. This "brake" on neutral diffusion 
being released during times of environmental stress could 
provide an alternative explanation for the patterns of adap- 
tive radiation seen after mass extinction events. 